t-Exponential Memory Networks for Question-Answering Machines
Recent advances in deep learning have brought to the fore models that can
make multiple computational steps in the service of completing a task; these
are capable of describing long-term dependencies in sequential data. Novel
recurrent attention models over possibly large external memory modules
constitute the core mechanisms that enable these capabilities. Our work
addresses learning subtler and more complex underlying temporal dynamics in
language modeling tasks that deal with sparse sequential data. To this end, we
improve upon these recent advances, by adopting concepts from the field of
Bayesian statistics, namely variational inference. Our proposed approach
consists in treating the network parameters as latent variables with a prior
distribution imposed over them. Our statistical assumptions go beyond the
standard practice of postulating Gaussian priors. Indeed, to allow for handling
outliers, which are prevalent in long observed sequences of multivariate data,
multivariate t-exponential distributions are imposed. On this basis, we proceed
to infer corresponding posteriors; these can be used for inference and
prediction at test time, in a way that accounts for the uncertainty in the
available sparse training data. Specifically, to allow for our approach to best
exploit the merits of the t-exponential family, our method considers a new
t-divergence measure, which generalizes the concept of the Kullback-Leibler
divergence. We perform an extensive experimental evaluation of our approach,
using challenging language modeling benchmarks, and illustrate its superiority
over existing state-of-the-art techniques.
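For orientation, one standard formulation of the t-divergence from the t-exponential family literature (a sketch for context; the paper's exact definition may differ) is built from the deformed logarithm and the escort distribution:

```latex
% Deformed logarithm of the t-exponential family:
\log_t(x) \;=\; \frac{x^{1-t}-1}{1-t}, \qquad t \neq 1 .

% Escort distribution of p:
\tilde{p}(x) \;=\; \frac{p(x)^{t}}{\int p(x')^{t}\,dx'} .

% t-divergence; for t -> 1 it reduces to the KL divergence:
D_t\!\left(p \,\|\, q\right) \;=\; \int \tilde{p}(x)\,\bigl[\log_t p(x) - \log_t q(x)\bigr]\,dx .
```

As t approaches 1, the deformed logarithm reduces to the ordinary logarithm and the escort distribution to p itself, recovering the Kullback-Leibler divergence.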
Natural Language Processing with Deep Neural Networks
Since the very first days of the Computer Era, machines have given us the
ability to collect and store vast amounts of information. It soon became obvious
that harvesting that information was an entirely different task, far more
complicated and demanding. Many solutions were developed to enable humans to
communicate with computers for the purpose of data mining. Databases, query
languages and search engines were for decades the most prevalent among them. At
first, special skills, such as knowledge of a query language or a search
engine's syntax, were required to perform advanced tasks. To be adopted
by users, search should be easy and human friendly. Nowadays, search engines
use simple language syntax and have made significant progress towards
natural language interaction.
Still, search engines fall short when it comes to combining information from
different sources to produce a synthetic answer. Computers are, after all,
computational machines: they are excellent at manipulating syntax and counting
word frequencies, but weak at recognizing the concepts behind the words. A
traditional search engine is not able to draw conclusions or grasp the context
of a dialogue. Machine Learning has proven strong in dealing with such concepts.
One of the most challenging fields of machine learning is Natural Language
Processing (NLP), and especially its component Natural Language Understanding
(NLU). The crest of NLP comprises the question-answering and summarisation tasks,
in the sense that strong cognitive ability is required for the conceptual context
to be extracted. Supervised learning of deep neural networks is currently the
best available tool for these tasks. Despite the rapid advances in the field of
Machine Learning, performance remains poor on hard NLU and NLP tasks, such as
abstractive summarisation and question answering.
This dissertation aims to offer substantive and measurable progress in both
these areas by mitigating a key problem of modern machine learning techniques:
the need for large, dense data corpora for effective model training.
This requirement is especially hard to meet in the context of such applications.
To this end, we leverage arguments from the field of Bayesian inference. This
allows us to better deal with modelling uncertainty, which is a direct outcome
of data sparsity and results in poor generalisation performance. Our approaches
are founded upon solid and elaborate statistical inference arguments,
and are evaluated on challenging, popular benchmarks. As we show, they
offer tangible performance advantages over the state-of-the-art.
Sergios Theodoridis, Nicolas Tsapatsoulis
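The Bayesian treatment advocated in the abstract above, where network parameters are treated as latent variables with inferred posteriors rather than point estimates, can be illustrated with a minimal reparameterisation sketch. This is illustrative only: the class names and the Gaussian posterior are our assumptions, not the dissertation's exact method.

```python
import math
import random

# Minimal sketch of a variational weight: instead of a point estimate,
# each parameter keeps a posterior mean and a log standard deviation,
# and predictions average over samples drawn from that posterior.

class VariationalWeight:
    def __init__(self, mu=0.0, log_sigma=-1.0):
        self.mu = mu                # posterior mean
        self.log_sigma = log_sigma  # posterior log std-dev

    def sample(self, rng):
        # Reparameterisation trick: w = mu + sigma * eps, eps ~ N(0, 1)
        eps = rng.gauss(0.0, 1.0)
        return self.mu + math.exp(self.log_sigma) * eps


def predictive_mean(weight, x, n_samples, seed=0):
    """Monte-Carlo average of a linear prediction w * x over the posterior."""
    rng = random.Random(seed)
    return sum(weight.sample(rng) * x for _ in range(n_samples)) / n_samples
```

The spread of the sampled predictions (not shown) is what quantifies the model's uncertainty under sparse training data; a point-estimate network discards that information.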
Dialog Speech Sentiment Classification for Imbalanced Datasets
Speech is the most common way humans express their feelings, and sentiment analysis is the use of tools such as natural language processing and computational algorithms to identify the polarity of these feelings. Even though this field has seen tremendous advancements in the last two decades, effectively detecting underrepresented sentiments in different kinds of datasets remains a challenging task. In this paper, we use single- and bi-modal analysis of short dialog utterances and gain insights into the main factors that aid sentiment detection, particularly in the underrepresented classes, in datasets with and without an inherent sentiment component. Furthermore, we propose an architecture which uses a learning rate scheduler and different monitoring criteria, and provides state-of-the-art results on the imbalanced SWITCHBOARD sentiment dataset.
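The combination the abstract mentions, a learning-rate scheduler driven by a monitoring criterion suited to imbalanced classes, can be sketched as follows. This is a hedged illustration, not the paper's architecture: the function and class names are ours, and macro-F1 is one plausible monitoring choice because it weights underrepresented classes equally.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (treats rare classes equally)."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)


class PlateauScheduler:
    """Halve the learning rate when the monitored metric stops improving."""

    def __init__(self, lr, factor=0.5, patience=2):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best, self.bad_epochs = float("-inf"), 0

    def step(self, metric):
        # Call once per epoch with the monitored validation metric.
        if metric > self.best:
            self.best, self.bad_epochs = metric, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

Monitoring macro-F1 (rather than accuracy) keeps the scheduler from declaring progress when only the majority class improves, which matters on skewed datasets like SWITCHBOARD.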